Conversation

prestonvasquez
Member

@prestonvasquez prestonvasquez commented Mar 5, 2025

GODRIVER-3173

Summary

Move pending response logic into the foreground.

To test locally you will have to check out the branch here for testdata/specifications: mongodb/specifications#1675

Background & Motivation

From the specifications update:

When a connection is checked out after a network timeout, the driver now attempts to resume and complete reading any pending server response (instead of closing and discarding the connection). This may require multiple checkouts.
Each pending response read is subject to a cumulative 3-second static timeout. The timeout is refreshed after each successful read, acknowledging that progress is being made. If no data is read and the timeout is exceeded, the connection is closed.

To reduce unnecessary latency, if the timeout has expired while the connection was idle in the pool, a non-blocking single-byte read is performed; if no data is available, the connection is closed immediately.
This update introduces new CMAP events and logging messages (PendingResponseStarted, PendingResponseSucceeded, PendingResponseFailed) to improve observability of this path.

@mongodb-drivers-pr-bot mongodb-drivers-pr-bot bot added the review-priority-low Low Priority PR for Review: within 3 business days label Mar 5, 2025
Contributor

mongodb-drivers-pr-bot bot commented Mar 5, 2025

API Change Report

./v2/event

compatible changes

ConnectionPendingResponseFailed: added
ConnectionPendingResponseStarted: added
ConnectionPendingResponseSucceeded: added
PoolEvent.RequestID: added

./v2/x/mongo/driver

compatible changes

RetryablePendingResponseError: added

./v2/x/mongo/driver/topology

incompatible changes

BGReadCallback: removed
BGReadTimeout: removed

compatible changes

PendingResponseTimeout: added

@prestonvasquez prestonvasquez changed the title from "(POC V2) DRIVERS-2868 Complete pending reads on conn checkout" to "(POC V2) GODRIVER-3173 Complete pending reads on conn checkout" Mar 5, 2025
@prestonvasquez prestonvasquez requested a review from Copilot March 19, 2025 17:45
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This pull request implements changes to support completing pending reads on connection checkout. Key changes include:

  • Adding new YAML tests for client‐side operation timeouts related to pending reads.
  • Updating connection and pool logic to handle pending reads with context values (e.g. maxTimeMS and requestID) and synchronizing background reads.
  • Introducing new monitoring events and updating event verification and test instrumentation to track pending read events.

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| testdata/client-side-operations-timeout/pending-reads.yml | New YAML tests for handling operation timeouts and pending read scenarios. |
| internal/integration/client_test.go | Updated concurrent operation tests to account for pending reads. |
| internal/driverutil/context.go | Added context helper functions for maxTimeMS and requestID propagation. |
| x/mongo/driver/topology/pool.go | Updated pool checkout logic to await completion of pending reads and renamed timeout constants. |
| x/mongo/driver/topology/pool_test.go | Revised pool tests to verify new pending read behaviors. |
| internal/integration/unified/event_verification.go | Added fields and tests for verifying new pending read pool events. |
| event/monitoring.go | Extended pool event monitoring with pending read event support. |
| x/mongo/driver/operation.go | Propagated context values in operation execution for pending read handling. |
| internal/integration/csot_test.go | Updated CSOT tests to ensure connection closure conditions with pending reads. |
Comments suppressed due to low confidence (2)

internal/integration/unified/event_verification.go:61

  • Field name 'ConnectionPendingreadSucceeded' should be 'ConnectionPendingReadSucceeded' to maintain consistent capitalization with the other pending read event fields.
ConnectionPendingreadSucceeded *struct{} `bson:"connectionPendingReadSucceeded"`

x/mongo/driver/topology/pool.go:908

  • Typo in comment: 'alawys' should be corrected to 'always'. Consider removing or clarifying this inline question to avoid confusion.
if size == 0 { // Question: Would this alawys equal to zero?

@prestonvasquez prestonvasquez changed the title from "(POC V2) GODRIVER-3173 Complete pending reads on conn checkout" to "GODRIVER-3173 Complete pending reads on conn checkout" Apr 30, 2025
@prestonvasquez prestonvasquez marked this pull request as ready for review April 30, 2025 22:40
@prestonvasquez prestonvasquez requested a review from a team as a code owner April 30, 2025 22:40
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements pending response handling by moving the pending read logic into the foreground and replacing the legacy BGReadCallback mechanism. Key changes include renaming the test function from TestBackgroundRead to TestAwaitPendingRead, replacing the awaitRemainingBytes field with a new pendingResponseState (with corresponding context values), and updating pool and connection logic to support pending response events.

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| x/mongo/driver/topology/pool_test.go | Renamed test functions and updated assertions to use pendingResponseState instead of awaitRemainingBytes |
| x/mongo/driver/topology/pool.go | Replaced BGReadTimeout with PendingResponseTimeout and integrated awaitPendingResponse logic with new event publications |
| x/mongo/driver/topology/connection.go | Replaced awaitRemainingBytes with pendingResponseState and updated read methods accordingly |
| Other test and integration files | Adjusted test expectations and event verifications for pending read events |
Comments suppressed due to low confidence (1)

x/mongo/driver/topology/connection.go:929

  • [nitpick] Consider removing or clarifying the inline comment about whether size would always equal zero to avoid confusion in future maintenance.
if size == 0 { // Question: Would this alawys equal to zero?

_, err = conn.readWireMessage(ctx)
regex := regexp.MustCompile(
`^connection\(.*\[-\d+\]\) incomplete read of message header: context deadline exceeded: read tcp 127.0.0.1:.*->127.0.0.1:.*: i\/o timeout$`,
)
assert.True(t, regex.MatchString(err.Error()), "error %q does not match pattern %q", err, regex)
assert.Nil(t, conn.awaitRemainingBytes, "conn.awaitRemainingBytes should be nil")
close(errsCh) // this line causes a double close if BGReadCallback is ever called.
assert.Nil(t, conn.pendingResponseState, "conn.awaitRemainingBytes should be nil")

Copilot AI Apr 30, 2025

Please update the error message to refer to 'conn.pendingResponseState' instead of 'conn.awaitRemainingBytes' for clarity and consistency with the new field name.

Suggested change
- assert.Nil(t, conn.pendingResponseState, "conn.awaitRemainingBytes should be nil")
+ assert.Nil(t, conn.pendingResponseState, "conn.pendingResponseState should be nil")


  // them to all return a timeout error because the failpoint
- // blocks find operations for 500ms. Run 50 to increase the
+ // blocks find operations for 50ms. Run 50 to increase the
Collaborator

It has been changed to 150ms.

@github-actions github-actions bot added the review-priority-normal Medium Priority PR for Review: within 1 business day label Sep 3, 2025
Contributor

🧪 Performance Results

Commit SHA: 7ed2567

The following benchmark tests for version 68b8792e15924f0007e0a657 had statistically significant changes (i.e., |z-score| > 1.96):

| Benchmark | Measurement | % Change | Patch Value | Stable Region | H-Score | Z-Score |
| --- | --- | --- | --- | --- | --- | --- |
| BenchmarkMultiInsertSmallDocument | total_time_seconds | 41.0325 | 1.4937 | Avg: 1.0591; Med: 1.0466; Stdev: 0.0421 | 0.9519 | 10.3191 |
| BenchmarkBSONFlatDocumentEncoding | ops_per_second_min | -34.0749 | 2313.9150 | Avg: 3509.9149; Med: 3628.6050; Stdev: 587.8170 | 0.7478 | -2.0346 |
| BenchmarkSingleRunCommand | ops_per_second_min | -23.1776 | 1234.6685 | Avg: 1607.1729; Med: 1632.1494; Stdev: 185.7210 | 0.7601 | -2.0057 |
| BenchmarkSmallDocInsertOne | total_bytes_allocated | -22.6419 | 26918888.0000 | Avg: 34797753.3333; Med: 35363064.0000; Stdev: 1865644.6942 | 0.8730 | -4.2231 |
| BenchmarkSmallDocInsertOne | total_mem_allocs | -22.5725 | 355536.0000 | Avg: 459185.6111; Med: 466638.0000; Stdev: 24493.0054 | 0.8734 | -4.2318 |
| BenchmarkMultiInsertSmallDocument | ns_per_op | 18.7372 | 7838.0000 | Avg: 6601.1307; Med: 6593.0000; Stdev: 160.5994 | 0.9298 | 7.7016 |
| BenchmarkMultiInsertSmallDocument | total_mem_allocs | 17.9456 | 2667953.0000 | Avg: 2262019.2161; Med: 2235865.0000; Stdev: 112401.1306 | 0.8627 | 3.6115 |
| BenchmarkSmallDocInsertOne | ns_per_op | 17.8417 | 228272.0000 | Avg: 193710.6667; Med: 187705.0000; Stdev: 12121.6461 | 0.8138 | 2.8512 |
| BenchmarkMultiInsertSmallDocument | total_bytes_allocated | 12.9285 | 502840376.0000 | Avg: 445273050.4575; Med: 444953576.0000; Stdev: 13558462.1194 | 0.8768 | 4.2459 |
| BenchmarkSingleFindOneByID | ns_per_op | 12.3295 | 292753.0000 | Avg: 260619.8750; Med: 255307.0000; Stdev: 15066.2497 | 0.7585 | 2.1328 |
| BenchmarkSmallDocInsertOne | ops_per_second_med | -12.3103 | 4767.4100 | Avg: 5436.6796; Med: 5579.2815; Stdev: 300.9671 | 0.7626 | -2.2237 |
| BenchmarkSingleFindOneByID | ops_per_second_med | -11.9360 | 3476.6646 | Avg: 3947.8831; Med: 4038.7886; Stdev: 227.6118 | 0.7493 | -2.0703 |
| BenchmarkSingleFindOneByID | ops_per_second_max | -11.8517 | 3965.4215 | Avg: 4498.5812; Med: 4488.0483; Stdev: 140.3047 | 0.8593 | -3.8000 |
| BenchmarkMultiInsertLargeDocument | ns_per_op | 9.9649 | 33975475.0000 | Avg: 30896646.9545; Med: 30378134.5000; Stdev: 1398741.7777 | 0.7898 | 2.2011 |
| BenchmarkBSONFlatDocumentEncoding | ns_per_op | 8.3555 | 16961.0000 | Avg: 15653.1111; Med: 15601.0000; Stdev: 391.5740 | 0.8333 | 3.3401 |
| BenchmarkBSONFlatDocumentEncoding | ops_per_second_med | -5.7130 | 67471.8305 | Avg: 71560.0754; Med: 71751.4530; Stdev: 1817.5036 | 0.7500 | -2.2494 |
| BenchmarkBSONFullDocumentDecoding | total_mem_allocs | -5.4884 | 8255827.0000 | Avg: 8735253.9412; Med: 8748842.5000; Stdev: 160091.2388 | 0.8193 | -2.9947 |
| BenchmarkMultiInsertSmallDocument | allocated_bytes_per_op | -4.9257 | 2638.0000 | Avg: 2774.6732; Med: 2795.0000; Stdev: 63.2682 | 0.7645 | -2.1602 |
| BenchmarkBSONFlatDocumentEncoding | ops_per_second_max | -4.9061 | 71602.4631 | Avg: 75296.5511; Med: 75426.1578; Stdev: 1622.6377 | 0.7563 | -2.2766 |
| BenchmarkBSONFullDocumentDecoding | ns_per_op | 4.6225 | 87862.0000 | Avg: 83980.0000; Med: 83620.5000; Stdev: 1616.6698 | 0.7879 | 2.4012 |
| BenchmarkBSONFlatDocumentDecoding | ns_per_op | 3.7025 | 60573.0000 | Avg: 58410.3529; Med: 58259.0000; Stdev: 900.9092 | 0.7688 | 2.4005 |
| BenchmarkBSONFullDocumentEncoding | ns_per_op | 3.6690 | 26574.0000 | Avg: 25633.4949; Med: 25538.0000; Stdev: 380.0789 | 0.7772 | 2.4745 |
| BenchmarkBSONFlatDocumentEncoding | total_time_seconds | 3.2576 | 1.2314 | Avg: 1.1926; Med: 1.1924; Stdev: 0.0187 | 0.7339 | 2.0822 |
| BenchmarkBSONDeepDocumentDecoding | ns_per_op | 3.2358 | 74283.0000 | Avg: 71954.6765; Med: 71965.0000; Stdev: 827.7633 | 0.7997 | 2.8128 |
| BenchmarkBSONFullDocumentDecoding | ops_per_second_med | -2.9957 | 12410.6434 | Avg: 12793.9131; Med: 12829.8893; Stdev: 186.6993 | 0.7443 | -2.0529 |
| BenchmarkBSONFullDocumentDecoding | ops_per_second_max | -1.8156 | 13091.2328 | Avg: 13333.3158; Med: 13311.0600; Stdev: 122.1398 | 0.7392 | -1.9820 |
| BenchmarkBSONDeepDocumentDecoding | total_time_seconds | 1.1474 | 1.2131 | Avg: 1.1993; Med: 1.1989; Stdev: 0.0058 | 0.7970 | 2.3608 |
| BenchmarkLargeDocInsertOne | allocated_bytes_per_op | 0.2179 | 5672.0000 | Avg: 5659.6667; Med: 5660.0000; Stdev: 5.7701 | 0.7708 | 2.1375 |
| BenchmarkBSONDeepDocumentDecoding | allocated_bytes_per_op | -0.0594 | 15094.0000 | Avg: 15102.9706; Med: 15104.0000; Stdev: 2.7687 | 0.8958 | -3.2400 |
| BenchmarkBSONFullDocumentDecoding | allocated_bytes_per_op | 0.0088 | 25326.0000 | Avg: 25323.7647; Med: 25324.0000; Stdev: 0.7410 | 0.8351 | 3.0168 |

For a comprehensive view of all microbenchmark results for this PR's commit, please check out the Evergreen perf task for this patch.

Labels
feature review-priority-low Low Priority PR for Review: within 3 business days review-priority-normal Medium Priority PR for Review: within 1 business day

2 participants